Abstract. It has been 10 years since Walt Lipke first introduced the concept
of Earned Schedule (ES). While progress has been made in understanding the
utility of ES in some small-scale and limited studies, a significant analysis of ES
in DoD acquisition programs is missing. This paper first analyzes whether ES and
Earned Value Management (EVM) provide fundamentally different information for
program managers. It then examines which technique, ES or EVM, provides more
timely and accurate schedule predictors in a broad spectrum of military weapon
system programs. We find ES to be more timely and accurate both in
software-intensive contracts and in the sample as a whole.


Data  Source 

The data for this analysis are from the Defense Acquisition
Management Information Retrieval (DAMIR) system. DAMIR
comprises all Contractor Performance Report (CPR) data for
major DoD acquisition programs. The CPR data contain the
monthly and quarterly performance information derived from the
contractor's EVMS for all Work Breakdown Structure (WBS)
elements within each contract of a program. Thus, they provide the
cost and schedule status for the contract [9].

This analysis focuses on 64 Acquisition Category (ACAT) I
aircraft contracts at the summary level (WBS 1). The programs
comprising the dataset have completed their acquisition phase
and are either in their operational phase or have been retired
from the Air Force fleet. The 64 contracts result in 1,087 data
points in the full analysis. We specifically examine the
software-intensive avionics contracts as a group, in addition to an
aggregated analysis of all 64 contracts.

Methodology  and  Results 


Background 

EVM has been the premier method of program management
and program cost forecasting within the DoD since its inception
in the 1960s. However, there are well-documented limitations
to EVM, particularly with respect to schedule analysis [1]. These
limitations include: 1) reporting schedule variance in terms of
dollars rather than time; 2) the regression of EVM schedule
efficiency metrics (SPI($)) to 1 as projects near completion,
despite variable schedule performance; and 3) the regression
of EVM schedule variance metrics (SV($)) to zero as projects
near completion. For practitioners in the field, these issues
make traditional EVM schedule analysis unwieldy. To mitigate
these limitations, Walt Lipke developed the concept of ES as an
alternative to EVM [1]. Lipke's ES construct measures schedule
performance with analogous earned value metrics dubbed
Schedule Performance Index (SPI(t)) and Schedule Variance
(SV(t)), where (t) indicates the metric is reported in time.
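To make the contrast concrete, the sketch below computes both families of metrics for a small made-up contract. It uses the commonly cited Earned Schedule formulation, ES = C + (EV - PV_C) / (PV_{C+1} - PV_C), with SPI(t) = ES / AT and SV(t) = ES - AT, where AT is the actual time elapsed. That formula is not spelled out in this paper, and the planned values, earned values, seven-period duration, and function names are illustrative assumptions, not DAMIR data.

```python
def earned_schedule(pv_cum, ev):
    """Return ES: the time at which the baseline planned to earn `ev`.

    pv_cum[i] is cumulative planned value at the end of period i + 1 and is
    assumed to be strictly increasing (hypothetical baseline).
    """
    c = 0  # number of whole periods whose planned value has been earned
    while c < len(pv_cum) and pv_cum[c] <= ev:
        c += 1
    if c == len(pv_cum):  # all planned work has been earned
        return float(c)
    prev = pv_cum[c - 1] if c > 0 else 0.0
    # Linear interpolation inside the partially earned period.
    return c + (ev - prev) / (pv_cum[c] - prev)


def schedule_metrics(pv_cum, ev, actual_time):
    """Return (SPI($), SV($), SPI(t), SV(t)) at `actual_time` periods."""
    # After the planned finish, cumulative PV stays at the budget at completion.
    pv = pv_cum[min(actual_time, len(pv_cum)) - 1]
    es = earned_schedule(pv_cum, ev)
    return ev / pv, ev - pv, es / actual_time, es - actual_time


# Illustrative numbers only: a 5-period plan that actually takes 7 periods.
PV_CUM = [100.0, 200.0, 300.0, 400.0, 500.0]
EV_CUM = [80.0, 160.0, 240.0, 320.0, 400.0, 450.0, 500.0]

for t, ev in enumerate(EV_CUM, start=1):
    spi_d, sv_d, spi_t, sv_t = schedule_metrics(PV_CUM, ev, t)
    print(f"t={t}  SPI($)={spi_d:.2f}  SV($)={sv_d:7.1f}  "
          f"SPI(t)={spi_t:.2f}  SV(t)={sv_t:5.2f}")
```

In this toy case the contract finishes two periods late, yet SPI($) returns to 1.00 and SV($) to 0 at completion, while SPI(t) ends at 5/7 (about 0.71) and SV(t) at -2 periods.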

But the question remains: Should DoD managers utilize ES
as a preferred schedule analysis technique? Program managers
should only implement ES analysis as part of their tool kit if it
provides additional benefit beyond the established EVM
techniques. Thus, the answer to the question becomes an empirical
matter. Previous studies (Henderson [2] [3], Lipke [4],
Vanhoucke & Vandevoorde [5], Rujirayanyong [6], Tzaveas,
Katsavounis & Kalfakakou [7], Lipke [8]) have examined the efficacy
of ES, but these studies were all limited by their extremely small
sample size or lack of relevance to the DoD.

This paper overcomes the shortcomings of the previous literature by
analyzing 64 contracts in major Air Force aircraft acquisition
programs to determine whether ES provides more timely and
accurate information. These contracts include software-intensive
contracts such as avionics along with hardware-intensive
contracts such as engines, capturing the full spectrum of an aircraft
acquisition effort. The large sample size and direct relationship to
military programs make the results of this analysis directly
applicable to DoD software and hardware program managers.
Preliminary Analysis

The first question to answer is whether ES and EVM provide
fundamentally different information to program managers. Once
this is ascertained, the method that provides better information,
measured in this paper by timeliness and accuracy, can be
determined. We statistically test the difference between ES and
EVM through a paired t-test of SPI($) and SPI(t). A paired t-test
assesses whether the mean difference between two sets of paired
observations is zero. The null hypothesis is that there is no
difference between the methods. Table 1 shows the results.


t-Test: Paired Two Sample for Means

                               Variable 1      Variable 2
Mean                           0.939165476     0.95750293
Variance                       0.008831643     0.006653895
Observations                   1087            1087
Pearson Correlation            0.689419981
Hypothesized Mean Difference   0
df                             1086
t Stat                         -8.623145392
P(T<=t) one-tail               1.13734E-17
t Critical one-tail            1.646257934
P(T<=t) two-tail               2.27467E-17
t Critical two-tail            1.962150792

Table 1: Paired t-test SPI($) vs SPI(t)


As shown in Table 1, the p-value of the t-test is 2.27E-17, well
below our significance level of 0.05. Therefore, the null hypothesis
is rejected. This means the difference between the mean SPI($) and
SPI(t) values is statistically significant, indicating that ES and EVM
information are fundamentally different from each other. In practical
terms, this indicates that utilizing the ES technique provides
additional information to the program manager. The question then
becomes whether the ES information is more valuable, as measured by
its timeliness and accuracy.
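As a cross-check on Table 1, the paired t statistic can be recovered from the summary statistics alone, because the variance of the paired differences equals s1^2 + s2^2 - 2 r s1 s2. The sketch below assumes the reported variances are sample variances and the correlation is the sample Pearson correlation, as in a typical spreadsheet t-test output; the function name is illustrative.

```python
import math


def paired_t_from_summary(mean1, mean2, var1, var2, r, n):
    """Return (t statistic, degrees of freedom) for a paired t-test.

    Works from summary statistics only: the variance of the paired
    differences is var1 + var2 - 2 * r * sqrt(var1 * var2).
    """
    cov = r * math.sqrt(var1 * var2)          # sample covariance of the pairs
    var_diff = var1 + var2 - 2.0 * cov        # variance of the differences
    std_err = math.sqrt(var_diff / n)         # standard error of the mean difference
    return (mean1 - mean2) / std_err, n - 1


# Values copied from Table 1 (Variable 1 and Variable 2 as labeled there).
t_stat, df = paired_t_from_summary(
    mean1=0.939165476, mean2=0.95750293,
    var1=0.008831643, var2=0.006653895,
    r=0.689419981, n=1087,
)
print(f"t = {t_stat:.4f}, df = {df}")
```

The result agrees with the t Stat and df rows of Table 1 to within rounding of the inputs, and its magnitude is far beyond the two-tail critical value of about 1.96.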
Testing Timeliness

Metrics help managers determine when a problem is occurring
so that corrective action may be taken. For this analysis, a
problem was defined as an SPI($) or SPI(t) < 0.90. The intent of
this test is to determine whether EVM or ES is an earlier
detector of problems in meeting program schedule objectives (see note 1).
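The detection rule can be sketched as follows. The 0.90 threshold comes from the text and the consecutive-period requirement from note 1. The paper says only "multiple" consecutive periods, so the default of 2 below is an assumption, as are the function name, the choice to report the start of the run, and the sample series.

```python
def first_detection(spi_series, pct_complete, threshold=0.90, min_run=2):
    """Return the percent-complete at which a sustained problem is first flagged.

    A problem is reported at the first period of a run of at least `min_run`
    consecutive SPI values below `threshold`. Returns None if no such run
    exists. `min_run=2` is an assumed reading of "multiple".
    """
    run_start = None
    run_len = 0
    for i, spi in enumerate(spi_series):
        if spi < threshold:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len >= min_run:
                return pct_complete[run_start]
        else:
            run_len = 0  # a recovery resets the run
    return None


# Hypothetical monthly series for one contract.
pct = [5, 10, 15, 20, 25, 30]
spi_t = [0.95, 0.88, 0.93, 0.89, 0.87, 0.85]   # one-period dip at 10% is ignored
spi_d = [0.97, 0.92, 0.94, 0.91, 0.89, 0.88]

print("ES detects at", first_detection(spi_t, pct), "% complete")
print("EVM detects at", first_detection(spi_d, pct), "% complete")
```

Requiring a run rather than a single value keeps a brief early dip that quickly recovers from being counted as a detection.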

The initial dataset examined is the subset of software-intensive
avionics contracts. Of these contracts that both ES and
EVM identify as a problem, EVM identifies the problem at the
18.87% completion point, while ES identifies the problem at the
16.88% completion point. The two techniques therefore differ by only
about 2 percentage points in when they first flag these contracts.
However, drawing conclusions based on this is
misleading. Rather, the analysis necessitates that we look at all
the avionics contract problems detected, even if only one of ES
or EVM detects it. See Figure 1.

Figure  1  shows  that  ES  strictly  dominates  EVM.  ES  identifies 
more  problems  at  every  completion  point  of  the  contract.  More 
importantly,  at  the  earlier  stages  of  the  program,  ES  detects  more 
problems.  For  instance,  at  the  20%  completion  point,  ES  detects 
seven  programs  with  problems  while  EVM  only  detects  two. 

This early difference in detection is critical, as it allows program
managers to take corrective action early in the program. Figure
1 also demonstrates a second area where ES is more valuable
than EVM. Note that around the two-thirds program completion point,
EVM no longer detects any problems, while ES remains useful in
problem detection through the end of program completion.

Next, we analyzed the full 64-contract dataset. The total
number of SPI(t) and SPI($) values below 0.90 was analyzed
at each of the following program completion points: 20%, 40%,
50%, 60%, 80%, and 90%. See Table 2.



Consistent with previous literature, as a contract approaches its
completion point, EVM yields an SPI($) value that approaches 1.0,
indicating that the program is on schedule even if it is not. This is seen
at the 90% completion point in Table 2, where SPI(t) correctly found 20
programs to be "in trouble," while SPI($) found only 1.

Testing  Accuracy 

Two  analyses  are  performed  to  compare  the  accuracy  of  ES 
and  EVM.  First,  we  measure  the  SPI($)  and  SPI(t)  in  relation  to 
the  final  schedule  result.  Whichever  method  is  closer  to  the  final 
contract  over/under  run  is  deemed  to  be  the  more  accurate 
technique.  The  results  for  the  avionics  subset  of  contracts  are 
shown  in  Table  3. 


                            Number of      Percentage of Overall
                            Occurrences    Occurrences (%)
Earned Value Management     107            43.67
Earned Schedule             126            51.43
EVM = ES                    12             4.90

Table 3: Accuracy of ES and EVM in Avionics Contracts


Table 3 shows that ES is more accurate than EVM in the avionics
subset. There is a difference of approximately 8 percentage points
between the techniques for these software-intensive contracts. While
this finding is notable, the accuracy margin widens to 21 percentage
points when the full 64-contract dataset is analyzed. Of the 1,087 data
points, EVM is closer to the final schedule result 37% of the time, while
ES is the more accurate technique 58% of the time. The EVM and ES
values are equivalent 5% of the time. Thus, for both the avionics
subset and the dataset as a whole, ES outperforms EVM in accuracy.
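The tally behind Table 3 can be sketched as below. The paper does not specify how the "final schedule result" is expressed numerically; the sketch assumes a single final index per contract (for example, planned duration divided by actual duration), and the function names are illustrative. The last lines simply recompute the Table 3 percentages from the reported counts.

```python
def closer_method(spi_dollar, spi_time, final_index):
    """Return 'EVM', 'ES', or 'tie' for one data point.

    `final_index` is an assumed numeric form of the final schedule result.
    """
    err_evm = abs(spi_dollar - final_index)
    err_es = abs(spi_time - final_index)
    if err_evm < err_es:
        return "EVM"
    if err_es < err_evm:
        return "ES"
    return "tie"


def shares(counts):
    """Convert a dict of counts to percentages rounded to two decimals."""
    total = sum(counts.values())
    return {k: round(100.0 * v / total, 2) for k, v in counts.items()}


# Counts reported in Table 3 (avionics subset).
table3 = {"EVM": 107, "ES": 126, "tie": 12}
print(shares(table3))
```

The ES share exceeds the EVM share by 51.43 - 43.67 = 7.76 percentage points, which the text rounds to 8.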

The second analysis, shown in Figure 2, depicts the frequency
of contracts having a particular percentage of their data points
closer to the final schedule result. For instance, the B1B
Offensive Avionics Lot 1 has 15 points where the SPI(t) is closer
to the final schedule result than the SPI($). There are 20 data
points for this program, so ES is closer to the final schedule
result 75% of the time. As depicted in Figure 2, this contract
is 1 of 9 contracts where the SPI(t) value is closest to the final
schedule result between 70% and 75% of the time. There is a
definite skew left to this histogram, demonstrating the greater
accuracy of ES. In fact, there are only four programs that have
fewer than 30% of their data points with SPI(t) values closer to
the final schedule result.
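The per-contract quantity plotted in Figure 2 is a simple proportion, shown here for the B1B example. The 5-point bin width and the right-closed bin edges are assumptions chosen so that 75% falls in the 70% to 75% bin, as the text describes; the function name is illustrative.

```python
def es_closer_bin(es_closer_points, total_points, width=5):
    """Return (percent, (low, high)) for one contract.

    The bin index uses integer arithmetic so that exact boundaries such as
    75% land in the right-closed bin (70, 75] without floating-point drift.
    """
    percent = 100.0 * es_closer_points / total_points
    # Ceiling division gives the index of the right-closed bin.
    idx = -(-es_closer_points * 100 // (total_points * width))
    idx = max(idx, 1)  # place 0% in the first bin
    return percent, ((idx - 1) * width, idx * width)


# 15 of 20 data points have SPI(t) closer to the final schedule result.
print(es_closer_bin(15, 20))
```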


Figure 1: Avionics Comparison of Numbers of SPI Values
Below 0.90


          20%   40%   50%   60%   80%   90%
SPI(t)    20    17    11    14    15    20
SPI($)    12    11    4     5     2     1

Table 2: Number of SPI Values Below 0.90 Over Time


Table 2 shows quite clearly that as early as the 20% program
completion point, the ES metric was indicating a problem
more frequently than the EVM metric. Additionally, this gulf in
detection widens over the life of the program.
Figure 2: Distribution of Programs With ES Closer to Final Program Delivery

In addition to analyzing the contracts at an individual level,
we also want to determine how the entire portfolio behaves over a
period of time. As shown in Figure 3, the ES metric dominates
the EVM metric at all program completion percentage points.
This result points to ES providing valuable information to the
program manager.

Figure 3: Comparison of SPI Closer to Final Over Time

Other  Schedule  Techniques:  the  Critical  Path 

EVM is not the only technique used by DoD program managers
to analyze schedule. The most common methodology is the
Critical Path Method (CPM). Lipke [4] argues that Earned Schedule is
applicable to the critical path. We examine this finding in a small
subset of our data. Our results show a fundamental disconnect
between the level of Earned Value data collected and the level
of Critical Path data utilized by the program offices. Specifically,
we find that earned value data is collected at a much higher level
than the level at which critical path analysis is being performed,
rendering a comparison infeasible. This does not necessarily
suggest that ES is inapplicable to the CPM in the DoD. Rather, it
points to the necessity of requiring contractor EVMS reporting at
a lower level, as part of contract deliverables, than is typical today.
This, of course, would result in increased contract costs. More
research is needed in this area to determine the cost/benefit ratio.

Conclusion 

This paper has demonstrated with statistical significance that
ES is fundamentally different from EVM. Our empirical analyses
of 64 contracts show that not only is there a difference
between the two techniques, but that difference is wide enough
to warrant a reconsideration of the use of ES in DoD programs.
Specifically, we find ES to be both timelier and more accurate
than traditional EVM schedule analysis.

The practical implications of our research are straightforward.
Because we were unable to thoroughly test ES against CPM, we stop
short of recommending ES as its replacement. However, our
analysis indicates ES warrants more intensive use for schedule
analysis in DoD programs. Specifically, based on the findings of
our research, we believe that DoD ACAT I programs should
embrace ES as a complementary tool (i.e., the primary cross-check)
to the CPM that is predominantly utilized. Traditional EVM
schedule analysis techniques should not be abandoned completely,
but should be secondary to the CPM and ES techniques.

1. Preliminary data analysis demonstrated that there are frequent occurrences where
a program's SPI value drops below 0.90 early in a program and quickly recovers.
This created the potential for false conclusions, necessitating a different analysis.
Therefore, to be counted as "detecting" a problem in our analysis, the SPI metric
must remain below 0.90 for multiple consecutive time periods.